Partitioning Uncertain Workflows
Abstract
It is common practice to partition complex workflows into separate channels in order to speed up their completion. When this is done within a distributed environment, unavoidable fluctuations make individual realizations depart from the expected average gains. We present a method for breaking any complex workflow into several workloads in such a way that, once their outputs are joined, full completion takes less time and exhibits smaller variance than when running in only one channel. We demonstrate the effectiveness of this method in two different scenarios: the optimization of a convex function and the transmission of a large computer file over the Internet.
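The claim that a partitioned workload can finish sooner and with smaller variance can be illustrated with a small Monte Carlo sketch. This is not the method of the paper: the noise model (an independent Gaussian time per unit of work on each channel), the function name `simulate_completion`, and all parameter values are assumptions chosen only for illustration. The joined output is taken to be ready when the slower of the two channels finishes.

```python
import random
import statistics


def simulate_completion(fraction, total_work=1.0, trials=10_000, seed=0):
    """Return (mean, variance) of the completion time when `fraction` of the
    work goes to channel A and the remainder to channel B, run in parallel."""
    rng = random.Random(seed)
    times = []
    for _ in range(trials):
        # Assumed noise model: time per unit of work ~ N(1.0, 0.2),
        # floored at a small positive value so times stay positive.
        rate_a = max(rng.gauss(1.0, 0.2), 0.01)
        rate_b = max(rng.gauss(1.0, 0.2), 0.01)
        time_a = fraction * total_work * rate_a
        time_b = (1.0 - fraction) * total_work * rate_b
        # The outputs are joined only when both channels are done.
        times.append(max(time_a, time_b))
    return statistics.fmean(times), statistics.pvariance(times)


# fraction=1.0 sends everything to one channel; 0.5 splits the work evenly.
single_mean, single_var = simulate_completion(fraction=1.0)
split_mean, split_var = simulate_completion(fraction=0.5)

print(f"single channel: mean={single_mean:.3f} variance={single_var:.4f}")
print(f"even split:     mean={split_mean:.3f} variance={split_var:.4f}")
```

Under these assumptions the even split shows both a lower mean and a lower variance than the single channel. Varying `fraction` shows how the trade-off changes when the channels are not treated symmetrically.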
Related resources
Collaborative Data-centric Workflows: Towards Knowledge centric workflows and Integrating Uncertain Data
The acquisition of data, in particular for scientific data, is more and more organized in complex processes that are captured by workflows. These workflows are often driven by ontologies. For example the collaborative application Spipoll [3] proposes to collect information about pollination in France. The users take pictures of insects on flowers, download them on the application and then ident...
Partitioning and Scheduling Workflows across Multiple Sites with Storage Constraints
This paper aims to address the problem of scheduling large workflows onto multiple execution sites with storage constraints. Three heuristics are proposed to first partition the workflow into sub-workflows. Three estimators and two schedulers are then used to schedule subworkflows to the execution sites. Performance with three real-world workflows shows that this approach is able to satisfy sto...
A Bayesian Approach to the Partitioning of Workflows
When partitioning workflows in realistic scenarios, the knowledge of the processing units is often vague or unknown. A naive approach to addressing this issue is to perform many controlled experiments for different workloads, each consisting of multiple number of trials in order to estimate the mean and variance of the specific workload. Since this controlled experimental approach can be quite ...
Density-Based Clustering Based on Probability Distribution for Uncertain Data
Today, a great deal of uncertain digital data is produced, and handling it is very difficult. Commonly, the distance between uncertain object descriptions is expressed by one numerical distance value. Clustering on uncertain data is one of the essential and challenging tasks in mining uncertain data. The previous methods extend partitioning clustering methods like k-mean...
Remodelling Scientific Workflows for Cloud
In recent years, cloud computing has raised significant interest in the scientific community. Running scientific experiments in the cloud has its advantages like elasticity, scalability and software maintenance. However, the communication latencies are observed to be the major hindrance for migrating scientific computing applications to the cloud. The problem escalates further when we consider ...
Journal title:
- CoRR
Volume: abs/1507.00391
Publication date: 2015